Abstract

A modularity-specialized label propagation algorithm (LPAm) for detecting
network communities was recently proposed. This promising algorithm offers
some desirable qualities. However, LPAm favors community divisions where
all communities are similar in total degree, and thus it is prone to get
stuck in poor local maxima in the modularity space. To escape local maxima,
we employ a multistep greedy agglomerative algorithm (MSG) that can merge
multiple pairs of communities at a time. Combining LPAm and MSG, we propose
an advanced modularity-specialized label propagation algorithm (LPAm+).
Experiments show that LPAm+ successfully detects communities with higher
modularity values than ever reported in two commonly used real-world
networks. Moreover, LPAm+ offers a fair compromise between accuracy and
speed.

1. Introduction

Detecting communities in networks has attracted a great deal of interest
recently. Informally, a community is a densely connected subnetwork that is
only sparsely linked to the remaining network. Constructing algorithms for
detecting communities is of great importance because it provides insight
into the structures of real-world systems.

Modularity [2] is a scalar value that measures the quality of a particular
division of a network into communities. Among various kinds of methods for 
detecting network communities, one that is widely used is modularity opti- 
mization jsl. The modularity optimization method detects communities by 
searching over possible divisions of a network for one that has particularly
high modularity. Since finding the "best" community division with the
highest modularity value is proven to be NP-hard [3], exhaustive search over
all possible divisions is in general intractable. Therefore, all modularity
optimization algorithms are based on approximate optimization.

Recently, Raghavan, Albert et al. proposed a label propagation algorithm
(LPA) [5] for detecting network communities. This innovative and promising
algorithm uses only the network structure as a guide and can detect
communities at very high speed. Barber and Clark extended LPA by relating it
to modularity and introduced a modularity-specialized LPA (LPAm) [6].
However, it is found that LPAm is prone to get stuck in poor local maxima in the
modularity space joj . To detect communities with high modularity values, we 
improve LPAm by driving it out of local maxima and devise a new algorithm
called LPAm+ in this paper. Experiments show that LPAm+ successfully
detects communities with the highest modularity values in several commonly
used real-world networks.

The structure of the paper is as follows: in the next section, the definition
of modularity is reviewed. In section 3 we give a survey of LPA and LPAm.
Our algorithm is proposed in section 4. Experiments are shown in section 5,
followed by a conclusion and discussion in the last section.

2. Modularity 

To evaluate the goodness of a particular division of a network into
communities, Newman introduces a measure called modularity [2]. Consider an
(undirected and unweighted) network with $n$ nodes and $m$ edges represented
by an adjacency matrix $A$, whose element $A_{uv}$ is equal to 1 if there is
an edge between nodes $u$ and $v$, and 0 otherwise. The degree of a node $u$
is denoted by $k_u$. Suppose a particular division of the network into $N_c$
communities, such that each node $u$ is assigned to a community $l_u$.
Modularity essentially measures the actual fraction of intra-community edges
minus its expected value in a null model, where the community division is
the same but connections are made randomly between nodes. Formally,
modularity is defined as:

$$Q = \frac{1}{2m} \sum_{u,v=1}^{n} (A_{uv} - P_{uv}) \, \delta(l_u, l_v), \qquad (1)$$

where $P_{uv} = k_u k_v / 2m$ is the probability in the null model that an edge
exists between nodes $u$ and $v$, and $\delta$ is the Kronecker delta. Further, a
modularity matrix $B$ is defined with elements $B_{uv} = A_{uv} - P_{uv}$. Hence,
modularity is expressed as:

$$Q = \frac{1}{2m} \sum_{u,v=1}^{n} B_{uv} \, \delta(l_u, l_v). \qquad (2)$$

We can also reformulate modularity as the sum of contributions over all
communities:

$$Q = \sum_{t=1}^{N_c} \left[ \frac{I_t}{m} - \left( \frac{D_t}{2m} \right)^2 \right], \qquad (3)$$

where $I_t$ is the number of intra-community edges that have both ends in
community $t$, and $D_t$ is the sum of the degrees of all nodes in community $t$.
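As an illustration of the per-community form of modularity, the following minimal Python sketch computes $Q$ as the sum over communities of $I_t/m - (D_t/2m)^2$. The function name `modularity` and the data layout (an edge list plus a dict from node to label) are assumptions made here for illustration only; the sketch also assumes an undirected, unweighted network without self-loops or duplicate edges.

```python
from collections import defaultdict


def modularity(edges, labels):
    """Return Q = sum_t [I_t/m - (D_t/2m)^2] for an undirected edge list.

    edges  -- list of (u, v) pairs, each undirected edge listed once
    labels -- dict mapping every node to its community label
    (Both argument layouts are illustrative assumptions.)
    """
    m = len(edges)
    intra = defaultdict(int)       # I_t: edges with both ends in community t
    degree_sum = defaultdict(int)  # D_t: total degree of nodes in community t
    for u, v in edges:
        degree_sum[labels[u]] += 1
        degree_sum[labels[v]] += 1
        if labels[u] == labels[v]:
            intra[labels[u]] += 1
    return sum(intra[t] / m - (degree_sum[t] / (2 * m)) ** 2
               for t in degree_sum)


# Two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
whole = {u: "a" for u in range(6)}
print(modularity(edges, split), modularity(edges, whole))
```

For this small example, the division into the two triangles gives $Q = 2(3/7 - 1/4) = 5/14$, while placing all nodes in a single community gives $Q = 0$.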



3. LPA 

In this section, we give a survey of LPA and LPAm, which are the bases 
of the following discussion. 

3.1. LPA 

The idea of LPA is simple [5]: initially, each node in the network is assigned
a unique label indicating the community it belongs to. At every label
propagation step, each node sequentially updates its label to the one that is
most frequent among its neighbors. Formally, the label updating rule for node
$x$ is:

$$l_x^{new} = \arg\max_{l} \sum_{u=1}^{n} A_{ux} \, \delta(l_u, l), \qquad (4)$$

where $l_x^{new}$ indicates the new label for node $x$. If more than one label is
most frequent, the new label is chosen randomly from among them. The label
propagation step is performed iteratively until each node has a label that is
(one of) the most frequent label(s) of its neighbors. Finally, communities are
identified as groups of nodes bearing the same labels.
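The procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `lpa`, the adjacency-dict input, the `seed` and `max_sweeps` parameters, and the choice to leave a node untouched when its current label is already among the most frequent ones are all assumptions made here.

```python
import random
from collections import Counter


def lpa(adj, seed=0, max_sweeps=100):
    """Plain label propagation on an undirected graph.

    adj -- dict mapping each node to a list of its neighbours
    Returns a dict mapping each node to its final label.
    """
    rng = random.Random(seed)
    labels = {u: u for u in adj}          # every node starts with a unique label
    nodes = list(adj)
    for _ in range(max_sweeps):
        rng.shuffle(nodes)                # sequential updates in random order
        changed = False
        for x in nodes:
            if not adj[x]:
                continue                  # isolated node keeps its own label
            counts = Counter(labels[u] for u in adj[x])
            top = max(counts.values())
            candidates = [l for l, c in counts.items() if c == top]
            if labels[x] in candidates:
                continue                  # already one of the most frequent labels
            labels[x] = rng.choice(candidates)   # random tie-breaking
            changed = True
        if not changed:                   # stopping condition of the text
            break
    return labels


adj = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(lpa(adj, seed=1))
```

Because the update order and the tie-breaking are random, different seeds can produce different divisions on less trivial networks, which is the instability discussed in the text.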

The most striking feature of LPA is that it is computationally cheaper than
earlier methods (near-linear time complexity) [5]. The weakness is
that LPA is not stable: the algorithm is sensitive to the order in which node 
labels are updated in each step. Thus the solutions (and their corresponding 
modularity values) can be quite different in different runs 01 . Sometimes 
LPA may even end up with a trivial solution — all nodes are identified in the 
same community 0. 

3.2. LPAm

Barber and Clark extend LPA by modifying the label updating rule so
that modularity can be maximized, and propose a new algorithm called
LPAm [6]. We can rewrite (2) by separating the elements regarding the label
of node $x$ from the others, yielding:

$$Q = \frac{1}{2m} \left( \sum_{u \neq x} \sum_{v \neq x} B_{uv} \, \delta(l_u, l_v) - B_{xx} \right) + \frac{1}{m} \sum_{u=1}^{n} B_{ux} \, \delta(l_u, l_x). \qquad (5)$$

When updating the label for $x$, by selecting a new label that maximizes the
second term on the right-hand side of (5), we actually maximize $Q$. Hence,
to consider updating the label for node $x$, the updating rule of LPAm is:

$$l_x^{new} = \arg\max_{l} \sum_{u=1}^{n} B_{ux} \, \delta(l_u, l). \qquad (6)$$

Implementing LPAm brings about a monotonic increase in modularity, which
prevents the trivial solution from forming. Besides, LPAm preserves the
merit of the high speed of LPA. But LPAm is prone to get stuck in poor
local maxima in modularity space, with a similar total degree of nodes in 
different communities j^. Moreover, LPAm still suffers from the weakness of 
instability.
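To make the LPAm updating rule concrete, here is a minimal Python sketch. It scores a candidate label $l$ for node $x$ by the number of $x$'s neighbors labeled $l$ minus $k_x D_l / 2m$, where $D_l$ is the total degree of the other nodes labeled $l$. Several details are assumptions made for this illustration: the name `lpam`, the adjacency-dict input, restricting candidates to the current label and the neighbors' labels, leaving $x$ itself out of $D_l$ (its own contribution is treated as a constant across candidates), and keeping the current label on ties so that every move strictly increases modularity.

```python
from collections import defaultdict


def lpam(adj, labels=None, max_sweeps=1000):
    """Modularity-specialized label propagation (illustrative sketch).

    adj    -- dict mapping each node to a list of its neighbours
    labels -- optional initial labels; defaults to one unique label per node
    """
    labels = dict(labels) if labels is not None else {u: u for u in adj}
    degree = {u: len(adj[u]) for u in adj}
    two_m = sum(degree.values())
    total = defaultdict(int)                  # D_l: total degree per label
    for u in adj:
        total[labels[u]] += degree[u]
    for _ in range(max_sweeps):
        changed = False
        for x in adj:
            old = labels[x]
            total[old] -= degree[x]           # leave x itself out of D_l
            links = defaultdict(int)          # neighbours of x per label
            for u in adj[x]:
                links[labels[u]] += 1

            def score(l):
                return links[l] - degree[x] * total[l] / two_m

            best, best_score = old, score(old)
            for l in list(links):
                s = score(l)
                if s > best_score + 1e-12:    # move only on strict improvement
                    best, best_score = l, s
            labels[x] = best
            total[best] += degree[x]
            changed = changed or best != old
        if not changed:                       # local maximum reached
            break
    return labels


# Two triangles joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
print(lpam(adj))
```

On this small network the sketch ends with the two triangles as separate communities. The optional `labels` argument lets the same routine be restarted from an existing division, which is how it would be reused after a merging round.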

4. LPAm+ 

In this section, we first give an example in which LPAm gets stuck in a
local maximum, then we show how to escape the local maximum and
propose our improved algorithm LPAm+.

Fig. 1. A toy network. (a) The network is intuitively divided into two
communities. (b) LPAm gets stuck in a poor local maximum where the
network is divided into 4 communities and the modularity is 0.399. (c)
We escape the local maximum described in (b) by merging the communities
labeled 'a' and 'e', with modularity increased by 0.008. (d) After carrying
out LPAm again, we climb onto another local maximum, which is also the
global maximum, with modularity increasing from 0.407 to 0.413.



Take the toy network shown in Fig. 1(a) as an example. This network is
intuitively divided into two communities (painted in yellow and green,
respectively), with modularity equal to 0.413. Feeding LPAm with this
network, we obtain a division into four communities (Fig. 1(b)) whose
modularity is 0.399. Evidently this division corresponds to a local maximum
in the modularity space. Under the label updating rule (6), LPAm favors
community divisions where all communities are similar in total degree, which
immediately leads to the separation of communities {0-3}, {4,5} and {6-9}.

To escape the local maximum, we have to get rid of the current constraint.
Note that (6) is a modularity maximization rule based on the local structure
of the network. Taking a broader view, we can adopt the greedy rule for
merging communities that maximizes modularity: when LPAm gets stuck in a
local maximum (no modularity gain can be achieved from further label
propagation), we calculate the modularity changes for merging pairs of
communities, and merge those pairs that improve modularity most. In practice,
we employ the technique used in the multistep greedy agglomerative algorithm
(MSG) [8], which merges multiple pairs of communities simultaneously
under the following criterion: a pair of communities $t_1$ and $t_2$ is merged
only if neither $t_1$ nor $t_2$ is present in another pair inducing a higher
modularity change (see Appendix for additional details).

After merging communities, we escape the local maximum. Then we
carry out another round of LPAm. This is analogous to climbing
onto another local maximum. However, it is not guaranteed that the new
local maximum is good enough (although it is better than
the previous local maximum). Hence we repeat the above process
(escaping the local maximum and climbing onto another local maximum)
many times, until no improvement of modularity can be reached. Fig. 1(b)-1(d)
illustrate LPAm+ working on the toy network. The pseudo-code of LPAm+ is
presented in Algorithm 1. It is clear that LPAm+ brings a monotonic
increase in modularity. Since the time complexity estimation of
LPAm+ is a little complex, we leave it to the next section.



Algorithm 1 LPAm+.

assign each node a unique label
maximize modularity by LPAm
while $\exists$ community pair $(t_1, t_2)$ with $\Delta Q_{t_1 t_2} > 0$ do
    for every community pair $(t_1, t_2)$ with $(\Delta Q_{t_1 t_2} > 0) \wedge [\nexists\, i : (\Delta Q_{t_1 i} > \Delta Q_{t_1 t_2}) \vee (\Delta Q_{t_2 i} > \Delta Q_{t_1 t_2})]$ do
        merge communities $t_1$ and $t_2$
    end for
    maximize modularity by LPAm
end while

* $\Delta Q_{t_1 t_2}$ denotes the modularity change for merging communities $t_1$ and $t_2$.
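The merging round inside the while-loop of Algorithm 1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `merge_gains` and `select_pairs` are hypothetical, community labels are assumed to be mutually comparable (so that a pair can be stored as a sorted tuple), and the gain of merging two communities joined by $E$ edges is taken as $\Delta Q = E/m - D_{t_1} D_{t_2} / (2m^2)$, which follows from the definition of modularity as a sum over communities. Pairs with equal gains that share a community are resolved by keeping the first one encountered, which is also an assumption.

```python
from collections import defaultdict


def merge_gains(edges, labels):
    """Return {(t1, t2): delta_Q} for every pair of connected communities."""
    m = len(edges)
    total = defaultdict(int)      # D_t: total degree per community
    between = defaultdict(int)    # number of edges between two communities
    for u, v in edges:
        a, b = labels[u], labels[v]
        total[a] += 1
        total[b] += 1
        if a != b:
            between[(min(a, b), max(a, b))] += 1
    return {pair: e / m - total[pair[0]] * total[pair[1]] / (2 * m * m)
            for pair, e in between.items()}


def select_pairs(gains):
    """Pick pairs with positive gain such that neither community appears
    in another pair with a higher gain."""
    best = defaultdict(float)     # best gain seen for each community
    for (a, b), g in gains.items():
        best[a] = max(best[a], g)
        best[b] = max(best[b], g)
    chosen, used = [], set()
    for (a, b), g in sorted(gains.items(), key=lambda item: -item[1]):
        if g > 0 and g >= best[a] and g >= best[b] \
                and a not in used and b not in used:
            chosen.append((a, b))
            used.update((a, b))
    return chosen


gains = {("a", "b"): 0.3, ("b", "c"): 0.2, ("c", "d"): 0.1}
print(select_pairs(gains))
```

In this example only the pair ('a', 'b') is selected: ('b', 'c') is blocked because 'b' appears in a better pair, and ('c', 'd') is blocked because 'c' appears in ('b', 'c'). Note that this differs from simply walking down the sorted list and skipping already merged communities, which would also accept ('c', 'd').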



5. Experiments 

We test LPAm+ on several real-world networks that are commonly used by
other researchers for evaluating modularity optimization algorithms. These
networks include: the karate club network (Karate Club) [9], the dolphin
association network (Dolphins) [10], the network of co-purchased political
books (Political Books), the network of games between college football
teams (College Football) [12], the network of collaborations between jazz
musicians (Jazz), the network of metabolic reactions in Caenorhabditis
elegans (C. elegans), the network of email contacts at a university
(E-mail), the Pretty Good Privacy web of trust social network (PGP)
[16], and the network of co-authorships for e-print papers posted to the
condensed matter archive (Condmat2003) [13]. As most researchers did,
we uniformly treat all networks as undirected and unweighted, and exclude
all self-loop edges. Table 1 lists the numbers of nodes and edges after
preprocessing the data.

Table 1. The numbers of nodes and edges of the networks in our experiment.

Network            # of nodes   # of edges
Karate Club                34           78
Dolphins                   62          159
Political Books           105          441
College Football          115          613
Jazz                      198        2,742
C. elegans                453        2,025
E-mail                  1,133        5,451
PGP                    10,680       24,316
Condmat2003            27,519      116,181

We apply LPAm and LPAm+ one hundred times to each of the networks.
Table 2 shows the maximal modularity, the average modularity, the
standard deviation of modularity, and the average execution time collected
from these runs. The average modularity obtained by LPAm+ is higher than
that obtained by LPAm in all of the networks, and the maximal modularity is
higher in all networks except Jazz, where the two are equal. This implies the
success of our trick for escaping the local maxima. The standard deviation of
modularity for LPAm+ is also considerably smaller than that of LPAm. As a
matter of fact, the difference of modularity values between solutions of
LPAm+ in different runs is normally within 1%. Even in extreme cases, the
difference between the worst and the best modularity values is no more than
5%. Therefore, LPAm+ is much more stable than LPAm.

Table 2. Comparisons between LPAm and LPAm+. Values are collected from
one hundred runs for each network. Qmax denotes the maximal modularity
value, Qavg the average modularity value, σ the standard deviation of the
modularity value, and t the average execution time (in seconds, on a PC
with Intel Core 2 Duo CPU @ 2.53GHz).

                  LPAm                              LPAm+
Network           Qmax   Qavg   σ        t          Qmax   Qavg   σ        t
Karate Club       0.399  0.352  0.0277    0.009     0.420  0.418  0.0061     0.014
Dolphins          0.516  0.495  0.0076    0.019     0.529  0.523  0.0023     0.034
Political Books   0.522  0.493  0.0199    0.048     0.527  0.527  0.0011     0.088
College Football  0.604  0.579  0.0182    0.049     0.605  0.604  0.0018     0.080
Jazz              0.445  0.436  0.0092    0.229     0.445  0.444  0.0013     0.368
C. elegans        0.409  0.379  0.0138    0.354     0.452  0.441  0.0045     1.247
E-mail            0.537  0.496  0.0155    1.097     0.582  0.576  0.0028     3.589
PGP               0.726  0.705  0.0085    5.396     0.884  0.882  0.0009   114.221
Condmat2003       0.582  0.568  0.0036   31.952     0.755  0.751  0.0012   461.599

Fig. 2. Comparison of running time for LPAm and LPAm+ in networks of
different sizes (on a PC with Intel Core 2 Duo CPU @ 2.53GHz). The
horizontal axis is the network size (m).

Fig. 2 portrays the running time of LPAm and LPAm+ in networks of different
sizes. In the following, we give a time complexity analysis for LPAm+.
On the one hand, one step of label propagation in LPAm costs $O(m)$ time
[5], so the time complexity of LPAm is $O(rm)$, where $r$ is the number of
label propagation steps required to reach a local maximum in modularity
space. On the other hand, one round of merging pairs of communities, which
corresponds to the for-loop in Algorithm 1, requires a time of
$O(m \log n)$ [3]. Let $h$ denote the number of iterations needed for the
while-loop in Algorithm 1. The overall time of LPAm+ can be written as
$O(rm) + h(O(m \log n) + O(rm))$.

An exact estimation of $h$ is not possible, as it depends on the quality of
the intermediate solution obtained by LPAm. Suppose $d$ is defined as the
depth of the dendrogram describing the community structure. The number
of merging rounds for a single MSG algorithm (with the step width set to
$+\infty$) would be $d$. The number of merging rounds in our algorithm,
namely $h$, is less obvious, since LPAm is performed after each merging
round. However, note that only two cases can happen during the label
propagation process in LPAm: some communities disappear, and the remaining
communities exchange parts of their nodes with each other.^1 Hence we can
safely arrive at $h \approx d$. In Table 3 we list the true values of $h$
when LPAm+ is applied to the various networks mentioned above.

As for $r$, it is still not very well understood. In [5], the authors suggest
that the number of label propagation steps required for the LPA algorithm to
converge is independent of the number of nodes, and that after 5 steps 95% of
the nodes can get their "right" labels. We show the actual values of $r$
obtained from running LPAm+ on real-world networks in Table 3. It seems that
$r$ is bounded by a small constant. Therefore, $r = o(\log n)$.

Taken all together, in a hierarchical network where $d \sim \log n$, LPAm+
requires an overall time of $O(m \log^2 n)$. This scaling is the same as MSG [8]
and the classical greedy agglomerative algorithm [18].
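The overall estimate can be written out step by step. The following is only a sketch under the assumptions stated in the text, namely $r = o(\log n)$, $h \approx d$, and $d \sim \log n$:

$$\begin{aligned}
T &= O(rm) + h \,\bigl( O(m \log n) + O(rm) \bigr) \\
  &= o(m \log n) + O(\log n) \,\bigl( O(m \log n) + o(m \log n) \bigr) \\
  &= O(m \log^2 n).
\end{aligned}$$

The dominant contribution is therefore the $h$ merging rounds, each costing $O(m \log n)$; the label propagation steps only add lower-order terms as long as $r$ stays small.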

^1 In theory, there are three cases during the label propagation process in
LPAm: 1) existing communities disappear (this situation happens when all
nodes of a community select the labels of other communities as their new
labels); 2) communities exchange part of their nodes with each other (this
situation happens when a part of the nodes of one community select the labels
of other communities as their new labels); 3) new communities appear (this
situation happens when some nodes select unused labels as their new labels).
But case 3) never happens in practice.

Table 3. The average number of label propagation steps required for the
embedded LPAm to converge, denoted by r, and the number of iterations for
the while-loop, denoted by h, when LPAm+ is applied to real-world networks.
Values are averaged over one hundred runs in each of the networks. The
uncertainty of the final digit, calculated as the standard error of the mean,
is shown parenthetically.

Network            r         h
Karate Club        6.13(6)   1.02(1)
Dolphins           5.66(4)   1.71(4)
Political Books    6.52(6)   1.98(2)
College Football   5.45(4)   1.02(1)
Jazz               6.93(9)   1.54(4)
C. elegans         6.17(5)   6.95(9)
E-mail             7.11(6)   6.56(8)
PGP                4.61(1)   73.8(2)
Condmat2003        5.59(2)   55.9(7)

To compare the performance of LPAm+ with other algorithms, in Table 4
we include the (maximal) modularity values obtained by LPAm+ and by
many previously published methods on these networks. These methods are,
in order, the hybrid of the MSG algorithm and the node moving refinement
algorithm proposed by Schuetz and Caflisch (MSG-VM) [8], the hybrid of the
single-step greedy agglomerative algorithm by significance and the multilevel
node moving refinement algorithm advanced by Noack and Rotta (SS-ML) [19],
the greedy agglomerative algorithm put forward by Clauset, Newman and Moore
(Greedy) [18], the mathematical programming approach proposed by Agarwal and
Kempe (VP/LP) [20], the extremal optimization algorithm introduced by Duch
and Arenas (EO) [21], the simulated annealing implementation proposed by
Guimera and Amaral (SA) [22], and the spectral optimization method suggested
by Newman (SO) [3].

Table 4. The (maximal) modularities obtained by LPAm+ and many previously
published methods. A dash marks a value that is not available.

Network            LPAm+   MSG-VM  SS-ML   Greedy  VP/LP   EO      SA      SO
Karate Club        0.420   0.398   0.420   0.381   0.420   0.419   0.420   0.419
Dolphins           0.529   -       0.528   -       0.529   -       0.528   0.489
Political Books    0.527   -       0.527   -       0.527   -       0.527   0.399
College Football   0.605   0.603   0.600   0.556   0.605   -       0.605   0.602
Jazz               0.445   0.445   0.445   0.439   0.445   0.445   0.445   0.442
C. elegans         0.452   0.450   0.446   0.412   0.450   0.434   0.450   0.435
E-mail             0.582   0.575   0.577   0.503   0.579   0.574   0.579   0.572
PGP                0.884   0.878   0.884   0.849   -       0.846   -       0.855
Condmat2003        0.755   0.748   0.814   0.661   -       0.679   -       0.723

To make it clearer, in Table 5 we summarize the best solutions obtained
by LPAm+ and the ones with the highest modularity values ever reported
on these networks. For eight of the nine networks considered here (Karate
Club, Dolphins, Political Books, College Football, Jazz, C. elegans, E-mail
and PGP), LPAm+ finds the highest modularity values. In particular, for two
networks (C. elegans and E-mail), LPAm+ finds modularity values higher than
previously published. Only on the Condmat2003 network is LPAm+ outperformed,
by the SS-ML algorithm. It is interesting to note that SS-ML, which employs
the single-step greedy agglomerative algorithm followed by the multilevel
node moving refinement algorithm,^2 achieves a much higher modularity value
than other algorithms on this network. In [19], the devisers of SS-ML argue
that the MSG algorithm is generally less effective than the single-step
greedy agglomerative algorithm. Perhaps the reason that LPAm+ does not work
well on the Condmat2003 network is that its component MSG, with too
aggressive a strategy, diverts the algorithm to a suboptimal portion of the
solution space. It is also worthwhile to note that the VP/LP [20] and SA [22]
algorithms can also find the highest modularity values in some of the
networks. But they are computationally much more expensive and do not scale
to larger networks like PGP and Condmat2003. Therefore, though not as fast as
LPAm, which is noted for its speed, LPAm+ offers a fair compromise between
accuracy and speed.

^2 This algorithm is designed to improve modularity by "adjusting" misplaced
nodes.



6. Conclusion and discussion 

In this paper, we introduce a new community detection algorithm LPAm+ 
based on the previously proposed algorithm LPAm. The main idea is that 
we try to drive LPAm out of local maxima and hereby employ MSG to 
merge pairs of communities which are similar in total degree. Experiments 
show that LPAm+ improves LPAm in terms of modularity of the detected 
communities, with extra computational time. Besides, LPAm+ is more stable 
than LPAm. Compared with other algorithms, LPAm+ distinguishes itself 
by its accuracy (measured by modularity) while preserving relatively high 
speed. The fact that LPAm+ detects the highest modularity values in almost 
all of the test networks is impressive. 

Table 5. Comparison between the solution with maximal value of modularity
obtained by LPAm+ and the one with the highest modularity ever reported.
Nc is the number of detected communities. Q denotes the modularity value.
Sources indicate the referenced papers where we collected the data.

                   LPAm+          Published Algorithms
Network            Nc    Q        Nc     Q       Sources
Karate Club         4    0.420     4     0.420   [19], [20], [22], [25]
Dolphins            5    0.529     5     0.529   [20], [26]
Political Books     5    0.527     5     0.527   [19], [20]
College Football   10    0.605    10     0.605   [20], [25]
Jazz                4    0.445     4(5)  0.445   [8], [19], [20], [21]
C. elegans          9    0.452    11     0.450   [8], [20]
E-mail             10    0.582    11     0.579   [20]
PGP                99    0.884    93     0.884   [19]
Condmat2003        72    0.755    76     0.814   [19]

It should be noted that the speed of LPAm+ can still be substantially
improved. First, when updating the label for a node in LPAm, candidates for
the new label can be safely confined to the labels of the neighbors of that
node and an unused label (further, we find through experiments that an unused
label is never selected as a new label). In light of this, we need only update
the labels of nodes whose neighbors had a label change. This means only a
few labels need to be updated after most of the other labels are fixed. Hence
the speed of LPAm can be dramatically increased. Second, it is possible
to introduce a threshold and stop LPAm as soon as the modularity
gain from the latest label propagation step does not exceed this threshold.
Although these two heuristics have little influence on the final modularity
value, the computational time can be reduced to a great extent (the time
complexity of the algorithm remains the same, since the order of the number
of iterations for the while-loop is unchanged). For example, if we apply
these two heuristics (with the threshold set to 0.00001) on the Condmat2003
network, the running time is reduced from 461.599s to 96.1s, with modularity
dropping by only 0.013 (based on an average value).

It is also interesting to note that MSG is not the only means to drive
LPAm out of local maxima. After this work was done, we were informed that
Blondel et al. use a reduction method (communities are reduced into nodes)
[27] to escape the local maxima involved in another algorithm different from
LPAm, and propose a two-phase community detection algorithm [28]. It
seems that such a reduction method can also be used to drive LPAm out of
local maxima.

Another important issue is that the solutions of LPAm+ in different runs,
though they give similarly high modularity values, are not identical in their
compositions. This phenomenon is more obvious in large-scale networks. A very
recent paper [29] discusses the origin of this problem. How to make the
algorithm more deterministic is left for our future work.

Overall, the presented LPAm+ algorithm is a suitable choice for analyzing 
community structures in networks. 
Appendix A. The label updating rule for driving LPAm out of local maxima

The label updating rule of LPAm (6) can be rewritten as:

$$\begin{aligned}
l_x^{new} &= \arg\max_{l} \sum_{u=1}^{n} (A_{ux} - P_{ux}) \, \delta(l_u, l) && (A.1) \\
&= \arg\max_{l} \left( \sum_{u=1}^{n} A_{ux} \, \delta(l_u, l) - \sum_{u=1}^{n} P_{ux} \, \delta(l_u, l) \right) && (A.2) \\
&= \arg\max_{l} \left( \sum_{u=1}^{n} A_{ux} \, \delta(l_u, l) - \frac{k_x}{2m} \sum_{u=1}^{n} k_u \, \delta(l_u, l) \right). && (A.3)
\end{aligned}$$

The first term in (A.3) is equal to the number of $x$'s neighbors labeled $l$, and
the second term is the product of $k_x/2m$ ($k_x$ denotes the degree of $x$) and
the sum of degrees of nodes labeled $l$ ($D_l$).

As shown in section 4, the toy network is divided into four communities by
LPAm, with communities {0-3}, {4,5} and {6-9} being separated (Fig. 1(b)).
This division corresponds to a local maximum. But what is the reason? On
further analysis, we find that node 4 has one neighbor labeled 'a', one
labeled 'e', and one labeled 'g', and the sum of degrees of nodes labeled 'a'
or 'g' is large while the sum of degrees of nodes labeled 'e' is small.
Consider that node 4 is being updated. For the choice of the candidate new
label $l$ as 'a', 'e' or 'g', the value of $\sum_{u=1}^{n} A_{u4} \, \delta(l_u, l)$,
that is, the first term of (A.3), would amount to 1 in every case. Yet the
value of the second term of (A.3), $k_4 D_l / 2m$, would be smaller for the
choice of $l$ as 'e' than as 'a' or 'g'. According to the updating rule (A.3),
node 4 would keep its label unchanged and still select 'e' as the new label.
The same applies to node 5: it would also stick to label 'e' when updated.
Consequently, under the current updating rule, neither node 4 nor 5 is
willing to give in first. Suppose we disregard the current updating rule and
forcibly change the label of node 4 from 'e' to 'a'. Though this will bring
about a temporary decrease in modularity, it is reasonable to expect a
greater reward from the subsequent label updates for node 5 and other
nodes. Conceptually, we call nodes like 4 and 5, which block the system from
further progress, stubborn nodes. It is these stubborn nodes that result
in LPAm getting stuck in local maxima.

To escape the local maxima, we should attempt to let one or more of
the stubborn nodes make a compromise to break the deadlock.
Suppose $\{i_1, \ldots, i_k\}$ is a set of stubborn nodes labeled
$l_{\{i_1,\ldots,i_k\}}$. Our trick is to keep $i_1, \ldots, i_k$ holding the
same label and update it (let $i_1, \ldots, i_k$ make a compromise at the same
time). Treating the labels of $i_1, \ldots, i_k$ separately and rewriting (2),
we have:

$$Q = \frac{1}{2m} \left( \sum_{u \notin \{i_1,\ldots,i_k\}} \sum_{v \notin \{i_1,\ldots,i_k\}} B_{uv} \, \delta(l_u, l_v) - \sum_{u \in \{i_1,\ldots,i_k\}} \sum_{v \in \{i_1,\ldots,i_k\}} B_{uv} \, \delta(l_u, l_v) \right) + \frac{1}{m} \left( \sum_{u=1}^{n} \sum_{v \in \{i_1,\ldots,i_k\}} B_{uv} \, \delta(l_u, l_v) \right). \qquad (A.4)$$

The first term on the right-hand side of (A.4) is independent of the label of
$i_1, \ldots, i_k$. Hence the label updating rule to jump out of the local
maxima is:

$$l_{\{i_1,\ldots,i_k\}}^{new} = \arg\max_{l} \left( \sum_{u=1}^{n} \sum_{v \in \{i_1,\ldots,i_k\}} B_{uv} \, \delta(l_u, l) \right). \qquad (A.5)$$

If $l_{\{i_1,\ldots,i_k\}}^{new} \neq l_{\{i_1,\ldots,i_k\}}$, the change of label
for this set of stubborn nodes from $l_{\{i_1,\ldots,i_k\}}$ to
$l_{\{i_1,\ldots,i_k\}}^{new}$ is in effect equivalent to merging the community
pair labeled $l_{\{i_1,\ldots,i_k\}}$ and $l_{\{i_1,\ldots,i_k\}}^{new}$. In
practice, instead of identifying the stubborn nodes and then updating their
label according to (A.5), we directly merge a pair of communities, choosing
the one that results in the greatest increase in modularity.
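The modularity change for merging a pair of communities follows directly from writing modularity as a sum of per-community contributions $I_t/m - (D_t/2m)^2$. As a sketch, let $E_{t_1 t_2}$ denote the number of edges between communities $t_1$ and $t_2$ (a symbol introduced here for illustration only). The merged community has $I_{t_1} + I_{t_2} + E_{t_1 t_2}$ intra-community edges and total degree $D_{t_1} + D_{t_2}$, so

$$\begin{aligned}
\Delta Q_{t_1 t_2}
&= \left[ \frac{I_{t_1} + I_{t_2} + E_{t_1 t_2}}{m} - \left( \frac{D_{t_1} + D_{t_2}}{2m} \right)^2 \right]
 - \left[ \frac{I_{t_1}}{m} - \left( \frac{D_{t_1}}{2m} \right)^2 \right]
 - \left[ \frac{I_{t_2}}{m} - \left( \frac{D_{t_2}}{2m} \right)^2 \right] \\
&= \frac{E_{t_1 t_2}}{m} - \frac{D_{t_1} D_{t_2}}{2m^2}.
\end{aligned}$$

Under this expression a pair of communities with no edge between them always has $\Delta Q_{t_1 t_2} < 0$, so only connected pairs need to be examined when looking for merges that increase modularity.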

It is often the case that there are several sets of stubborn nodes that
block the system from progressing. Of course, we can merge them pair after
pair. To enhance the efficiency, we adopt the technique used in MSG [8], which
merges multiple pairs of communities simultaneously.
The implementation details are also discussed in [8].